Stripe api test server & Tests & Bug Fixes #245

Merged
Yostra merged 28 commits into v2 from test_server on Apr 9, 2026

Yostra commented Apr 4, 2026

Summary

Add a local, OpenAPI-driven Stripe list server (@stripe/sync-test-utils) so tests no longer depend on the remote rate-limited Stripe API. The server runs in-process against Postgres, implements list/retrieve for every endpoint discovered from the spec, and supports auth guards and fault injection.

Building the test infrastructure uncovered several bugs in the production code, which this PR also fixes.

Bug fixes

  • Unsupported query params sent to all endpoints: limit, starting_after, and ending_before were sent to every endpoint regardless of whether the OpenAPI spec advertises them (e.g. reporting_report_types), causing 400 errors. Added buildSpecAwareListFn, which filters params based on the spec.
  • Pagination on non-paginating endpoints: has_more was trusted and starting_after was sent even for endpoints that don't support forward pagination, potentially causing infinite loops or data loss. Added a supportsForwardPagination flag derived from the spec.
  • v2 created filters ignored: the source hardcoded supportsCreatedFilter: false for all v2 endpoints even when the spec declares created, skipping time-range filtering during segmented backfill.
  • v2 list functions didn't send created query params: buildListFn in listFnResolver.ts never wired params.created into v2 request URLs. It now encodes created[gte]/created[lt] with ISO timestamp conversion.
  • Non-OK API responses silently swallowed: buildListFn and buildRetrieveFn called response.json() without checking response.ok, so a 401 or 500 was parsed as if it were valid data. Added assertOk + StripeApiRequestError to throw explicitly.
  • No retry on transient Stripe API errors: getAccount, list, and retrieve calls had no retry logic, so a single transient 500 from Stripe would fail the entire setup/sync. Added withHttpRetry (exponential backoff, retries on 429/5xx/network errors).
  • Sync errors silenced: service workflows marked pipelines as ready even when the sync had permanent errors. Added classifySyncErrors to distinguish transient vs permanent failures, with proper error propagation in the Temporal workflow.
  • Unhandled Postgres pool errors: destination-postgres pool errors could crash the process. Added a pool.on('error', ...) handler.
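
To make the non-OK check and the retry fix above concrete, here is a minimal sketch of how the two might fit together. The names assertOk, StripeApiRequestError, and withHttpRetry come from this PR, but the signatures, the ResponseLike interface, the RetryOptions fields, and the default values are assumptions made for illustration, not the merged code.

```typescript
// Hypothetical minimal response shape; a fetch Response satisfies it structurally.
interface ResponseLike {
  ok: boolean;
  status: number;
}

class StripeApiRequestError extends Error {
  constructor(public readonly status: number, message: string) {
    super(message);
    this.name = "StripeApiRequestError";
  }
}

// Throw instead of letting an error body be parsed as if it were valid data.
function assertOk(response: ResponseLike): void {
  if (!response.ok) {
    throw new StripeApiRequestError(
      response.status,
      `Stripe API request failed with status ${response.status}`,
    );
  }
}

// 429 and 5xx are treated as transient. Anything that is not a
// StripeApiRequestError is assumed here to be a network-level failure.
function isRetryable(err: unknown): boolean {
  if (err instanceof StripeApiRequestError) {
    return err.status === 429 || err.status >= 500;
  }
  return true;
}

// Illustrative options; the real wrapper may expose different knobs.
interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number;
  sleep?: (ms: number) => Promise<void>;
}

async function withHttpRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions = {},
): Promise<T> {
  const maxAttempts = opts.maxAttempts ?? 3;
  const baseDelayMs = opts.baseDelayMs ?? 100;
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on non-transient errors such as 401.
      if (attempt >= maxAttempts || !isRetryable(err)) throw err;
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

A caller would wrap the whole request, for example `withHttpRetry(async () => { const res = await fetch(url); assertOk(res); return res.json(); })`, so that a 500 is retried while a 401 surfaces immediately.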

New package: @stripe/sync-test-utils

OpenAPI-driven local Stripe list/retrieve server backed by Postgres:

  • Discovers endpoints from the spec, creates tables, seeds with schema-generated fixtures
  • V1 and V2 list semantics (pagination, created filters, has_more / next_page_url)
  • Auth guards, fault injection hooks, optional OpenAPI query param validation
  • Docker Postgres helper for isolated test databases
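
As a rough illustration of the V1 list semantics mentioned above, the sketch below pages over an in-memory array instead of Postgres. The function name listV1 and all types are hypothetical; only the parameter names (limit, starting_after, created with gte/lt) and the has_more field follow Stripe's public list conventions.

```typescript
interface StripeObject {
  id: string;
  created: number; // Unix timestamp in seconds
}

interface ListParams {
  limit?: number;
  starting_after?: string;
  created?: { gte?: number; lt?: number };
}

interface ListPage<T> {
  object: "list";
  data: T[];
  has_more: boolean;
}

// Hypothetical in-memory stand-in for the Postgres-backed list handler.
function listV1<T extends StripeObject>(
  rows: readonly T[],
  params: ListParams = {},
): ListPage<T> {
  // Stripe defaults to 10 items per page and accepts 1 to 100.
  const limit = Math.min(Math.max(params.limit ?? 10, 1), 100);
  const gte = params.created?.gte;
  const lt = params.created?.lt;

  // Newest first, with id as a deterministic tie-breaker.
  const sorted = [...rows].sort(
    (a, b) => b.created - a.created || (a.id < b.id ? 1 : a.id > b.id ? -1 : 0),
  );
  const filtered = sorted.filter(
    (r) => (gte === undefined || r.created >= gte) && (lt === undefined || r.created < lt),
  );

  // starting_after is an object id; the page begins just past that object.
  let start = 0;
  if (params.starting_after !== undefined) {
    const idx = filtered.findIndex((r) => r.id === params.starting_after);
    if (idx === -1) throw new Error(`No such object: ${params.starting_after}`);
    start = idx + 1;
  }

  const data = filtered.slice(start, start + limit);
  return { object: "list", data, has_more: start + limit < filtered.length };
}
```

A client walks the list by passing the id of the last item on each page as starting_after until has_more is false.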

New tests

  • test-server-all-api: syncs every stream for every supported API version (2020+) through the real engine against the local test server
  • test-server-sync: engine-level sync and state checkpoint tests covering segmented backfill, boundary conditions, pagination, resume, empty segments, multi-stream, and v2 cursor streams
  • test-sync-engine: failure mode coverage for a bad API key, auth errors on individual streams, and transient 500s with retry
  • test-e2e-network: network disruption and fault injection. Pauses Stripe/Postgres/Temporal containers mid-sync, injects auth failures and transient 500s, and validates recovery and error propagation through the full service stack
  • Unit tests: resourceRegistry, client retry behavior, listFnResolver pagination flags and error propagation, index source setup/backfill flows

Other changes

  • Docker / CI hygiene: containers auto-remove, volumes clean up on exit
  • New e2e compose overlay (compose.e2e.yml)
  • OpenAPI object generator (objectGenerator.ts) for schema-based test fixture generation

Yostra force-pushed the test_server branch 4 times, most recently from 898865c to 6b8a5a2, on April 8, 2026 02:14
Yostra marked this pull request as ready for review on April 8, 2026 02:50
Yostra changed the title from "WiP testing" to "Stripe api test server & Tests & Bug Fixes" on Apr 8, 2026

tonyxiao left a comment

Review

Overall a high-quality PR — the buildSpecAwareListFn refactor, supportsForwardPagination propagation, StripeApiRequestError/assertOk, and withHttpRetry are all well-done. A few issues worth addressing before merge.
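
For context, the spec-aware filtering and the supportsForwardPagination flag can be pictured roughly as follows. This is a sketch under assumptions: the name buildSpecAwareListFn is from the PR, but its real signature is not shown in this thread, so the specQueryParams argument, the ListParams type, and the returned object shape are hypothetical.

```typescript
type ListParams = Record<string, string | number | undefined>;

interface SpecAwareList<T> {
  // True only when the spec advertises starting_after for this endpoint.
  supportsForwardPagination: boolean;
  list(params: ListParams): Promise<T>;
}

// specQueryParams: query parameter names the OpenAPI spec declares for one
// list endpoint (assumed to be extracted from the spec elsewhere).
function buildSpecAwareListFn<T>(
  specQueryParams: readonly string[],
  rawList: (params: ListParams) => Promise<T>,
): SpecAwareList<T> {
  const allowed = new Set(specQueryParams);
  return {
    supportsForwardPagination: allowed.has("starting_after"),
    list(params) {
      // Drop anything the endpoint does not declare, so that limit or
      // starting_after never reach an endpoint that would answer with a 400.
      const filtered: ListParams = {};
      for (const key of Object.keys(params)) {
        if (allowed.has(key) && params[key] !== undefined) {
          filtered[key] = params[key];
        }
      }
      return rawList(filtered);
    },
  };
}
```

A caller that sees supportsForwardPagination === false would fetch a single page and ignore has_more rather than looping on a cursor the endpoint does not accept.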

Critical

waitForErrorRecovery has a race with continueAsNew (pipeline-workflow.ts)

desiredStatusSignalCount is transient workflow-local state — it's not included in continueAsNew opts. After a history rollover while in errored state, the new run starts with desiredStatusSignalCount = 0, captures signalCount = 0, then the condition desiredStatusSignalCount > signalCount is immediately 0 > 0 = false. The workflow blocks waiting for a new signal, even if the operator already sent active before the rollover. Recovery should key on the desiredStatus value itself rather than a counter — or use a dedicated acknowledgeErrorSignal that is distinct from desiredStatusSignal.
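
To illustrate the race without pulling in the Temporal SDK, here is a plain TypeScript model of one possible fix: carry an explicit recovery flag across the rollover instead of relying on a run-local counter. Everything here is hypothetical (PipelineRun, CarriedState, recoveryRequested, and rollover as a stand-in for continueAsNew); it only shows why the counter loses the signal while carried state does not.

```typescript
// Hypothetical state passed to continueAsNew; field names are illustrative.
interface CarriedState {
  errored: boolean;
  recoveryRequested: boolean;
}

class PipelineRun {
  // Workflow-local values: both start from 0 again in every new run.
  private desiredStatusSignalCount = 0;
  private readonly signalCountAtStart = 0;
  private state: CarriedState;

  constructor(carried: CarriedState) {
    this.state = { ...carried };
  }

  // Called when the operator signals the pipeline back to active.
  onActiveSignal(): void {
    this.desiredStatusSignalCount++;
    if (this.state.errored) this.state.recoveryRequested = true;
  }

  // Counter-based check, as described in the review comment above.
  canRecoverByCounter(): boolean {
    return this.desiredStatusSignalCount > this.signalCountAtStart;
  }

  // State-based check: survives a rollover because the flag is carried.
  canRecoverByState(): boolean {
    return this.state.errored && this.state.recoveryRequested;
  }

  // Stand-in for continueAsNew: only CarriedState crosses the boundary.
  rollover(): PipelineRun {
    return new PipelineRun(this.state);
  }
}
```

In this model, a signal sent before the rollover is invisible to canRecoverByCounter in the new run but still visible to canRecoverByState.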

system_error covers transient 500s, causing false permanent failures (sync-errors.ts)

PERMANENT_FAILURE_TYPES = new Set(['system_error', 'config_error']) — but the source emits failure_type: 'system_error' for any non-rate-limit HTTP error, including transient 500s that survived all withHttpRetry retries. A Stripe infrastructure outage lasting longer than the retry window would permanently error every pipeline and require operator intervention. Consider either narrowing PERMANENT_FAILURE_TYPES to only config_error, or introducing a distinct auth_error type for 401/403 so that system_error can stay transient at the Temporal layer.
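
The second option could look roughly like the sketch below. The names classifySyncErrors and PERMANENT_FAILURE_TYPES are from the PR; the auth_error type is the one proposed here, and the rate_limit label, the SyncError shape, and failureTypeForStatus are assumptions for illustration only.

```typescript
// "rate_limit" is a placeholder name; the source's actual label is not shown here.
type FailureType = "config_error" | "auth_error" | "system_error" | "rate_limit";

interface SyncError {
  stream: string;
  failure_type: FailureType;
  message: string;
}

// Only failures that retrying cannot fix are permanent.
const PERMANENT_FAILURE_TYPES: ReadonlySet<FailureType> = new Set<FailureType>([
  "config_error",
  "auth_error",
]);

// Hypothetical mapping at the source: 401/403 get their own type so that
// system_error can stay transient at the Temporal layer.
function failureTypeForStatus(status: number): FailureType {
  if (status === 401 || status === 403) return "auth_error";
  if (status === 429) return "rate_limit";
  return "system_error";
}

function classifySyncErrors(errors: readonly SyncError[]): {
  permanent: SyncError[];
  transient: SyncError[];
} {
  const permanent: SyncError[] = [];
  const transient: SyncError[] = [];
  for (const err of errors) {
    (PERMANENT_FAILURE_TYPES.has(err.failure_type) ? permanent : transient).push(err);
  }
  return { permanent, transient };
}
```

With this split, a 500 that outlasts the retry window is classified as transient and can be retried by the workflow, while a revoked key still halts the pipeline.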

Important

Mixed permanent+transient errors in one sync run silently drops the transient errors (pipeline-sync.ts)

If a single run produces both permanent and transient errors, the if (permanent.length > 0) branch returns early — the transient errors are never retried and never surfaced to the caller.
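
One way to avoid dropping them is to return both lists on the permanent path. The sketch below is not the code in pipeline-sync.ts; SyncOutcome, resolveSyncOutcome, and the simplified SyncError shape are hypothetical.

```typescript
interface SyncError {
  stream: string;
  permanent: boolean;
  message: string;
}

type SyncOutcome =
  | { status: "ok" }
  | { status: "retry"; transient: SyncError[] }
  | { status: "failed"; permanent: SyncError[]; transient: SyncError[] };

function resolveSyncOutcome(errors: readonly SyncError[]): SyncOutcome {
  const permanent = errors.filter((e) => e.permanent);
  const transient = errors.filter((e) => !e.permanent);

  // The permanent branch still wins, but it carries the transient errors
  // along so the caller can log them or retry those streams after recovery.
  if (permanent.length > 0) return { status: "failed", permanent, transient };
  if (transient.length > 0) return { status: "retry", transient };
  return { status: "ok" };
}
```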

markPermanentError forces phase: 'backfilling' even when called from liveLoop (pipeline-workflow.ts)

When a live-event sync fails permanently, the pipeline may already be in phase: 'ready' (backfill complete). Resetting to 'backfilling' means after recovery the pipeline runs a full re-backfill instead of resuming live processing. markPermanentError should leave phase unchanged — errored: true alone is sufficient to halt processing.
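
A minimal sketch of the suggested behavior, assuming a state object with just phase and errored (the real workflow state likely has more fields, and clearError is a hypothetical helper):

```typescript
// Only the two phases named in this review; the real type may have more.
type Phase = "backfilling" | "ready";

interface PipelineState {
  phase: Phase;
  errored: boolean;
}

// Leave phase untouched; errored alone is enough to halt processing.
function markPermanentError(state: PipelineState): PipelineState {
  return { ...state, errored: true };
}

// Hypothetical recovery step: the pipeline resumes in whatever phase it was in.
function clearError(state: PipelineState): PipelineState {
  return { ...state, errored: false };
}
```

With this shape, a pipeline that errors in phase 'ready' returns to live processing after recovery instead of starting a full re-backfill.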

New test suites permanently excluded from CI with no alternative run path (ci.yml)

All four new suites (test-server-all-api, test-server-sync, test-sync-e2e, test-sync-engine) plus test-e2e-network are excluded with no fallback — no nightly job, no label trigger. test-e2e-network in particular covers the recovery scenarios being added in this PR. If they're intentionally local-only, a comment in the YAML explaining why would help; if they're meant to run somewhere, a plan for where would be good.

Minor

  • requestWithRetry in client.ts silently skips retry for POST/DELETE — the name implies otherwise. A rename to requestWithGetRetry or a JSDoc noting the GET-only policy would prevent future callers from being surprised.
  • quoteIdentifier in storage.ts rejects hyphens (^[A-Za-z_][A-Za-z0-9_]*$), which are common in CI Postgres schema names. Since the function already quotes with "...", the validation is overly restrictive without adding security value in a test-only context.
  • test-sync-e2e.test.ts source configs are missing api_version (required per project conventions).
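
On the quoteIdentifier point, a less restrictive version could rely on standard Postgres quoting alone: wrap the name in double quotes and double any embedded double quote. The sketch below is an assumption about what that might look like, not the code in storage.ts.

```typescript
// Quote an arbitrary name as a Postgres identifier.
// Postgres rules: an embedded " is written as "", and a quoted identifier
// may contain any character except the NUL character.
function quoteIdentifier(name: string): string {
  if (name.length === 0) {
    throw new Error("identifier must not be empty");
  }
  if (name.indexOf("\0") !== -1) {
    throw new Error("identifier must not contain a NUL character");
  }
  return `"${name.replace(/"/g, '""')}"`;
}
```

For example, `quoteIdentifier("ci-run-42")` yields `"ci-run-42"`, so hyphenated CI schema names pass through without loosening the escaping.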

Yostra force-pushed the test_server branch 2 times, most recently from 45a434a to 68ce297, on April 8, 2026 21:51
Yostra merged commit 93526ea into v2 on Apr 9, 2026
14 checks passed
Yostra deleted the test_server branch on April 9, 2026 01:45